Keir Fraser [Tue, 14 Dec 2010 09:54:10 +0000 (09:54 +0000)]
x86/iommu: account for necessary allocations when calculating Dom0's initial allocation size
As of c/s 21812:e382656e4dcc, IOMMU related allocations for Dom0
happen only after it got all of its memory allocated, and hence the
reserve (mainly for setting up its swiotlb) may get exhausted without
accounting for the necessary allocations up front.
While not precise, the estimate has been found to be within a couple
of pages for the systems it got tested on.
For the calculation to be reasonably correct, this depends on the
patch titled "x86/iommu: don't map RAM holes above 4G" sent out
yesterday.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 14 Dec 2010 09:53:41 +0000 (09:53 +0000)]
amd-iov: eliminate open-coded PCI bus scan
Instead, use scan_pci_devices() just like VT-d does. This at once
allows making {alloc,free}_pdev() static.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 14 Dec 2010 09:52:57 +0000 (09:52 +0000)]
x86/iommu: don't map RAM holes above 4G
Matching the comment in iommu_set_dom0_mapping(), map only actual RAM
from the address range starting at 4G. It's not clear though whether
that comment is actually correct (which is why I'm sending this as
RFC), but it is certain that on systems with sparse physical memory
map we're currently wasting a potentially significant amount of memory
for setting up IOMMU page tables that will never be used.
The main question is what happens for MMIO ranges living above 4G. Of
course, the same issue would currently exist for any such ranges
sitting beyond the end of RAM.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Juergen Gross [Thu, 9 Dec 2010 13:50:56 +0000 (14:50 +0100)]
add missing libxl__free_all() calls
In various libxl functions libxl__free_all() was missing before return
Signed-off-by: juergen.gross@ts.fujitsu.com
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Anthony PERARD [Mon, 13 Dec 2010 17:59:45 +0000 (17:59 +0000)]
libxl: Makes libxl be able to call Qemu upstream for XenPV guest.
In libxl_build_device_model_args_new:
- Adds -xen-attach options to the list of arguments to Qemu.
- Adds -vga xenfb options when vnc and sdl are not set.
- Remove disk list from the command line for XenPV as they will be
read from xenstore by Qemu.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
tools/libxl/libxl.c | 30 +++++++++++++++++++++---------
1 files changed, 21 insertions(+), 9 deletions(-)
Anthony PERARD [Mon, 13 Dec 2010 17:59:02 +0000 (17:59 +0000)]
libxl: strdup disk path before put it in qemu args array.
In libxl_build_device_model_args_new, the path to the disk image was
freed before it was actually used to build the argument list for Qemu.
The patch strdups it.
This patch also changes argv[0] of the device model.
Now, it is the conventional argv[0], so its value comes from
info->device_model.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
tools/libxl/libxl.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)
Anthony PERARD [Mon, 13 Dec 2010 17:58:20 +0000 (17:58 +0000)]
libxl: fix double free of ifname, when makes args for qemu.
In libxl_build_device_model_args_new, vifs[i].ifname can be freed
twice: by the gc, and by freeing the vifs structures. This patch avoids
this.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
tools/libxl/libxl.c | 10 +++++++---
1 files changed, 7 insertions(+), 3 deletions(-)
Chunyan Liu [Mon, 13 Dec 2010 17:39:27 +0000 (17:39 +0000)]
Fix /vm/uuid xenstore leak on tapdisk2 device cleanup
While block-detaching a blktap2 disk, the tap2 device info is not cleared from
xenstore /vm/uuid/xxx. The reason is in the xen-hotplug-cleanup script: when $vm_dev
does not exist, $(xenstore-read "$vm_dev" 2>/dev/null) is also "", so the block
is not entered. So, change to use the command's return value to check existence.
Signed-off-by: Chunyan Liu <cyliu@novell.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
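Illustrative note on the change above (not taken from the changeset): a minimal Python sketch of why testing the command's output is weaker than testing its exit status. `fake_xenstore_read` is a hypothetical stand-in for `xenstore-read PATH 2>/dev/null`, and the sketch assumes a node can exist while reading back as an empty string.

```python
def fake_xenstore_read(store, path):
    """Hypothetical stand-in for `xenstore-read PATH 2>/dev/null`.

    Returns (exit_status, stdout): status 0 if the node exists, 1 otherwise.
    """
    if path in store:
        return 0, store[path]
    return 1, ""


def exists_by_output(store, path):
    # Output-based test: an empty string is treated as "not there".
    _status, out = fake_xenstore_read(store, path)
    return out != ""


def exists_by_status(store, path):
    # Status-based test: only the exit status decides existence.
    status, _out = fake_xenstore_read(store, path)
    return status == 0
```

With an existing but empty node, the output-based test reports "missing" while the status-based test reports "present".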
Stefano Stabellini [Mon, 13 Dec 2010 17:15:31 +0000 (17:15 +0000)]
Adds an open xenstore connection function which tries to use the xenbus
interface (xs_domain_open) when the socket interface (xs_daemon_open)
fails.
Signed-off-by: Mihir Nanavati <mihirn@cs.ubc.ca>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
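Illustrative note on the change above (not the actual C code): the fallback order can be sketched as follows. `open_socket` and `open_xenbus` are hypothetical callables standing in for xs_daemon_open and xs_domain_open, and failure is modelled as `None`.

```python
def open_xenstore(open_socket, open_xenbus):
    """Try the socket interface first, then fall back to the xenbus interface.

    Both arguments are hypothetical callables returning a handle,
    or None on failure.
    """
    handle = open_socket()
    if handle is None:
        handle = open_xenbus()
    return handle
```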
Ian Campbell [Thu, 2 Dec 2010 12:49:00 +0000 (12:49 +0000)]
libxc: allow caller to specify no re-entrancy protection when opening the interface
Used by language bindings which provide their own re-entrancy which conflicts
with pthreads.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Campbell [Thu, 2 Dec 2010 12:49:00 +0000 (12:49 +0000)]
libxc: rename safe_strerror to xc_strerror and pass in an XC handle for future use.
Make the function public since I have future patches which depend on this.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Campbell [Thu, 2 Dec 2010 12:33:01 +0000 (12:33 +0000)]
libxc: NetBSD: implement xc_evtchn_bind_unbound_port.
Doesn't actually appear to be used anywhere but is defined for other
OSes.
The NetBSD evtchn.h contains comments "Return allocated port" for
several ioctls which currently return the allocated port as a member
of the argument structure and not as the ioctl return value (I think
this is a cut and paste error).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christoph Egger <Christoph.Egger@amd.com>
committer: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 10 Dec 2010 19:38:30 +0000 (19:38 +0000)]
etherboot: Download ipxe from XEN_EXTFILES_URL
Allows us to build even if the ipxe git server is down. We still fall
back to the git server if we cannot download a suitably named tarball
from our own URL.
Signed-off-by: Keir Fraser <keir@xen.org>
Ian Campbell [Fri, 10 Dec 2010 18:44:07 +0000 (18:44 +0000)]
tools/hotplug/Linux: Ensure tap devices receive a dummy MAC address.
If a tap device is not given an explicit MAC address it will generate
one randomly.
The behaviour of the Linux bridge is to pick up the lowest MAC address
of any port for use in ARP, STP etc. If the tap device's randomly
generated MAC address happens to be the lowest then this can cause all
manner of strange networking glitches in both domain 0 and guests when
the bridge suddenly takes over from the previously used MAC address.
We choose FE:FF:FF:FF:FF:FF as it is the numerically largest
non-broadcast address. This ensures that the physical NIC device's
port will have the lowest MAC address and therefore be the one picked
up by the bridge.
vif devices already have a MAC address of FE:FF:FF:FF:FF:FF set by
netback, but there is no harm in forcing it a second time in
the hotplug script.
tap devices are added by the "add" event and therefore we should call
setup_bridge_port then as well as for "online" which is caused by vif
devices.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
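Illustrative note on the change above: a small Python model of the bridge behaviour described in the message, where the bridge adopts the lowest MAC address of any port. The example addresses are made up; only FE:FF:FF:FF:FF:FF comes from the commit message.

```python
DUMMY_MAC = "FE:FF:FF:FF:FF:FF"


def mac_to_int(mac):
    """Convert 'AA:BB:CC:DD:EE:FF' into an integer for numeric comparison."""
    return int(mac.replace(":", ""), 16)


def bridge_mac(port_macs):
    """Model: the bridge uses the numerically lowest MAC among its ports."""
    return min(port_macs, key=mac_to_int)
```

A tap port with a randomly generated low address would displace the physical NIC's address, whereas a tap port set to the dummy address cannot.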
Ian Campbell [Fri, 10 Dec 2010 18:30:45 +0000 (18:30 +0000)]
libxl: "Device model is a stubdom" is not an error
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 10 Dec 2010 18:29:50 +0000 (18:29 +0000)]
libxl: return an error if connection to xenstore fails
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 10 Dec 2010 18:27:54 +0000 (18:27 +0000)]
Merge
Jan Beulich [Fri, 10 Dec 2010 18:27:17 +0000 (18:27 +0000)]
libxc: correct bounce direction for debug-key handling
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Fri, 10 Dec 2010 18:20:25 +0000 (18:20 +0000)]
mem_event: Remove unused fields and definitions.
Signed-off-by: Keir Fraser <keir@xen.org>
Wei Gang [Fri, 10 Dec 2010 18:16:07 +0000 (18:16 +0000)]
tools/ocaml: fix a typo in xc_lib.c:xc_domain_set_vpt_align
Fix a typo in xc_lib.c.
Signed-off-by: Wei Gang <gang.wei@intel.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Juergen Gross [Fri, 10 Dec 2010 18:13:15 +0000 (18:13 +0000)]
tools/python: Rebuild python extensions if depends have changed
Adds depends information for building python extensions.
The extensions depend on the library binaries they are using.
Signed-off-by: juergen.gross@ts.fujitsu.com
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Fri, 10 Dec 2010 18:08:19 +0000 (18:08 +0000)]
xend: fix "xm block-detach 0 ..." for extended-ID devices
Simply taking stat()'s st_rdev doesn't work here, as the minor is
split into two parts, the major is present, and the "extended" bit
isn't set.
Rather than fixing this in a way that would likely be OS-dependent,
simply remove the access to the device file, and instead just parse
the provided string (as is done e.g. for block-attach).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
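Illustrative note on the change above (not the xend code itself): a sketch of deriving a virtual device number from the provided string instead of from stat(). It assumes the xvd numbering documented for Xen virtual block devices: (202 << 8) | (disk << 4) | partition for disks and partitions below 16, and (1 << 28) | (disk << 8) | partition otherwise. The function name is hypothetical and only xvd names are handled.

```python
import re

XVD_MAJOR = 202
EXTENDED_BIT = 1 << 28


def xvd_name_to_devid(name):
    """Parse 'xvd<letters>[<partition>]' (optionally with /dev/) into a device id."""
    m = re.fullmatch(r"(?:/dev/)?xvd([a-z]+)(\d*)", name)
    if m is None:
        raise ValueError("not an xvd device name: %r" % name)
    letters, part = m.group(1), m.group(2)
    # 'a' -> 0, 'z' -> 25, 'aa' -> 26, ...
    disk = 0
    for ch in letters:
        disk = disk * 26 + (ord(ch) - ord("a") + 1)
    disk -= 1
    partition = int(part) if part else 0
    if disk < 16 and partition < 16:
        return (XVD_MAJOR << 8) | (disk << 4) | partition
    return EXTENDED_BIT | (disk << 8) | partition
```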
Keir Fraser [Fri, 10 Dec 2010 16:49:05 +0000 (16:49 +0000)]
hvm vlapic: Fix tmcct read logic when in periodic mode.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Fri, 10 Dec 2010 16:40:05 +0000 (16:40 +0000)]
x86: acpi: Fix reboot attempt sequence.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Fri, 10 Dec 2010 11:32:19 +0000 (11:32 +0000)]
x86 acpi: Follow Windows behaviour more closely during reset.
This follows some changes proposed for upstream Linux:
1. Do not check the FADT reset register size/offset
2. Try ACPI poking twice during our reset attempt sequence
Hopefully this will help us reset reliably on a wider range of
platforms.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Fri, 10 Dec 2010 11:01:19 +0000 (11:01 +0000)]
tmem: Use of 'new' clashes with C++ reserved namespace.
Rename to 'creat', which does not conflict.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Fri, 10 Dec 2010 10:51:31 +0000 (10:51 +0000)]
credit2: Implement cross-L2 migration "resistance"
Resist moving a vcpu from one L2 to another unless its credit
is past a certain amount.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
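Illustrative note on the change above: a deliberately simplified reading of "resistance" as a plain threshold test. The name `MIGRATE_RESIST`, its value and the exact comparison are assumptions for illustration; the scheduler's real condition may differ.

```python
MIGRATE_RESIST = 500  # hypothetical threshold, not the scheduler's real value


def may_move_across_l2(credit, resist=MIGRATE_RESIST):
    """Allow a cross-L2 move only once the vcpu's credit is past the threshold."""
    return credit > resist
```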
Keir Fraser [Fri, 10 Dec 2010 10:51:04 +0000 (10:51 +0000)]
credit2: Don't migrate cpus unnecessarily
Modern processors have one or two L3's per socket, and generally only
one core per L2. Credit2's design relies on having credit shared
across several. So, as a first step towards sharing a queue across L3 while
avoiding excessive cross-L2 migration:
Step one: If the vcpu's current cpu is acceptable, just run it there.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
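Illustrative note on the change above: "step one" can be sketched as a short-circuit in cpu selection. `choose_other` is a hypothetical fallback policy and is not part of the changeset.

```python
def pick_cpu(current_cpu, acceptable_cpus, choose_other):
    """Keep the vcpu where it is if that cpu is acceptable; otherwise defer
    to a fallback policy (hypothetical callable) over the acceptable set."""
    if current_cpu in acceptable_cpus:
        return current_cpu
    return choose_other(acceptable_cpus)
```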
Keir Fraser [Fri, 10 Dec 2010 10:50:16 +0000 (10:50 +0000)]
credit2: Trace rdtsc value with some more records
Makes the traces slightly larger, but easier to analyze.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Fri, 10 Dec 2010 10:49:48 +0000 (10:49 +0000)]
credit2: Use vcpu processor explicitly
No functional changes.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Fri, 10 Dec 2010 10:49:20 +0000 (10:49 +0000)]
credit2: Putting a vcpu to sleep also removes the delayed_runq_add flag
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Thu, 9 Dec 2010 19:19:34 +0000 (19:19 +0000)]
x86: x2apic: Large cleanup
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Thu, 9 Dec 2010 16:17:33 +0000 (16:17 +0000)]
Add CPU_STARTING notifier during CPU bringup.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Thu, 9 Dec 2010 16:15:10 +0000 (16:15 +0000)]
x86: time: tsc_set_info() must skip the idle domain.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Thu, 9 Dec 2010 10:09:59 +0000 (10:09 +0000)]
Move IDLE_DOMAIN_ID defn to public header, and change DOMID_INVALID to fix clash.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Thu, 9 Dec 2010 09:57:08 +0000 (09:57 +0000)]
x86: Simplify tsc_set_info() slightly -- no domain has id DOMID_INVALID.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Thu, 9 Dec 2010 08:34:59 +0000 (08:34 +0000)]
x86:vlapic: Fix possible guest tick losing after save/restore
Guest vcpu may totally lose all ticks if the vlapic->pt.irq was not
restored during save/restore process. Fix it.
Signed-off-by: Wei Gang <gang.wei@intel.com>
Keir Fraser [Thu, 9 Dec 2010 08:34:04 +0000 (08:34 +0000)]
Fix xc_cpuid_hvm_policy to avoid guest CPUID feature missing.
Signed-off-by: Wei Gang <gang.wei@intel.com>
Keir Fraser [Thu, 9 Dec 2010 08:30:30 +0000 (08:30 +0000)]
sched/arinc653: fix another unsigned < 0 comparison
replacing it with a test of the appropriate unsigned max.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
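Illustrative note on the change above: why an "unsigned < 0" test never fires, and what a test against the unsigned maximum looks like. The 32-bit width and the helper names are assumptions chosen for illustration; the scheduler's actual type is not stated in the message.

```python
import ctypes

UINT32_MAX = 0xFFFFFFFF


def as_u32(value):
    """Store a Python int in a 32-bit unsigned variable, as C would."""
    return ctypes.c_uint32(value).value


def failed_buggy(ret_u32):
    # Never true: an unsigned value cannot be negative.
    return ret_u32 < 0


def failed_fixed(ret_u32):
    # An error return of -1 shows up as the unsigned maximum.
    return ret_u32 == UINT32_MAX
```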
Tim Deegan [Wed, 8 Dec 2010 10:46:31 +0000 (10:46 +0000)]
x86/mm: change ASSERTs to BUG_ONs in mem_sharing.c
These two ASSERTs have important side-effects so make them into BUG_ONs
consistent with the rest of the file.
Bug found by Jui-Hao Chiang <juihaochiang@gmail.com>.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
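Illustrative note on the change above: an assertion that is compiled out takes its side effects with it. Python's `assert` behaves analogously when compiled with optimisation, which this sketch uses as a stand-in for a non-debug build; it is an analogy, not the hypervisor code.

```python
SRC = "assert log.append('side effect') is None"


def run_with_asserts(enabled):
    """Execute SRC with assertions kept (optimize=0) or stripped (optimize=1)
    and report how many times the side effect ran."""
    log = []
    code = compile(SRC, "<sketch>", "exec", optimize=0 if enabled else 1)
    exec(code, {"log": log})
    return len(log)
```

With assertions stripped, the side effect silently disappears, which is the reason to use an unconditional check for expressions that must always run.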
Keir Fraser [Tue, 7 Dec 2010 18:32:04 +0000 (18:32 +0000)]
x86: remove BUG_ON() from QUIRK_IOAPIC_*_REGSEL handler
Since (non-pvops, 32-bit only up to 2.6.27) Linux would report "BAD"
unconditionally on all SiS chipset versions (it only looks for a PCI
device at 0000:00:00.0 with SiS as the vendor), we must not crash if
the report on a 64-bit hypervisor doesn't match the #define (which is
zero).
While we could honor the quirk indication even on 64-bit, it doesn't
seem worthwhile, as there's no evidence that newer SiS chipsets
(supporting 64-bit CPUs) are actually affected.
This should also address bug 1687 (mis-reported, however, afaict).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 7 Dec 2010 18:30:19 +0000 (18:30 +0000)]
svm: dump VMCB physical address
VMCB physical address is useful for hardware debug. This small patch
dumps VMCB physical address.
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Keir Fraser [Tue, 7 Dec 2010 18:28:19 +0000 (18:28 +0000)]
amd xsave: Enable SVM intercept for xsetbv instruction.
SVM introduces an intercept control bit for xsetbv instruction. This
patch enables xsetbv intercept for SVM.
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Keir Fraser [Tue, 7 Dec 2010 18:27:40 +0000 (18:27 +0000)]
amd xsave: Enable XSAVE/XRSTOR for SVM guest
This patch creates a common interface for handling xsetbv.
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Keir Fraser [Tue, 7 Dec 2010 18:26:38 +0000 (18:26 +0000)]
amd xsave: Move xsave initialization code to a common place
This patch moves xsave/xrstor code to CPU common file. First of all,
it prepares xsave/xrstor support for AMD CPUs. Secondly, Xen would
crash on __context_switch() without this patch on xsave-capable AMD
CPUs. The crash was due to cpu_has_xsave reporting true in domain.c
while xsave space wasn't initialized.
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Keir Fraser [Tue, 7 Dec 2010 18:24:12 +0000 (18:24 +0000)]
x86 hvm: x2APIC emulation
This patch enables Xen to handle x2APIC MSR accesses by HVM guests,
which is faster (it avoids decoding MMIO accesses). Credit goes to
Gleb Natapov, who completed the work for KVM.
Tested with a 4-vcpu guest, with and without x2apic support.
From: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Tue, 7 Dec 2010 18:10:46 +0000 (18:10 +0000)]
x86 hvm: Fix VLAPIC TMCCT register when timer is one-shot
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Fri, 3 Dec 2010 06:37:48 +0000 (06:37 +0000)]
minios: reverse layering of xc vs minios fd close
Having minios close() call back into the libxc core close routines is
backwards and unexpected. On every other OS the libxc core close
routine calls close().
Export minios specific functions from the minios libxc code to
implement fd closing for each type of xc file handle and simply call
close() in the core close routine.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Fri, 3 Dec 2010 06:36:41 +0000 (06:36 +0000)]
MAINTAINERS: Correct Winston Wang's email address.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Fri, 3 Dec 2010 06:35:40 +0000 (06:35 +0000)]
xen: x86_32p: fix build breakage from 22456:1b6cc8c6d1c7
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Thu, 2 Dec 2010 09:37:04 +0000 (09:37 +0000)]
x86: Add -Wredundant-decls to Xen build flags.
Fix up the fallout.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Wed, 1 Dec 2010 21:20:14 +0000 (21:20 +0000)]
ARINC 653 scheduler
From: Josh Holtrop <Josh.Holtrop@dornerworks.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Wed, 1 Dec 2010 20:12:12 +0000 (20:12 +0000)]
x86/IRQ: pass CPU masks by reference rather than by value in more places
Additionally simplify operations on them in a few cases.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 1 Dec 2010 20:11:30 +0000 (20:11 +0000)]
x86/IRQ: consolidate declarations
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 1 Dec 2010 20:10:27 +0000 (20:10 +0000)]
x86: fix IRQ migration when using directed EOI (broken with c/s 20465)
In directed-EOI mode, there is no chance to do the migration in
mask_and_ack_level_ioapic_irq(), as the remote IRR bit can't possibly
be clear after issuing the EOI to the LAPIC. Consequently, there's no
point to even try. Instead, migration must be done in
end_level_ioapic_irq(), and it requires masking the interrupt source
prior to issuing the EOI to the IO-APIC.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 30 Nov 2010 11:34:08 +0000 (11:34 +0000)]
x86 hvm: Do not overwrite boot-cpu capability data on VMX/SVM startup.
Apparently required back in the earliest days of Xen, we now properly
initialise CPU capabilities early during bootstrap. Re-writing
capability data later now causes problems if specific features have
been deliberately masked out.
Thanks to Weidong Han at Intel for finding such a bug where XSAVE
feature is masked out by default, but then erroneously written back
during VMX initialisation. This would cause memory corruption problems
during boot for XSAVE-capable systems.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Mon, 29 Nov 2010 17:44:32 +0000 (17:44 +0000)]
x86: fixes after emuirq changes
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Keir Fraser [Mon, 29 Nov 2010 14:40:55 +0000 (14:40 +0000)]
x86: tighten filter on ptwr_do_page_fault()
Even not-so-recent Linux may, due to post-2.6.18 changes to the
process creation code, cause quite a number (depending on environment
and argument size) of faulting accesses to user space originating from
kernel mode. Generally those happen for non-present pages and would
lead to a nested page fault from guest_get_eff_l1e(). They can be
avoided by checking for PFEC_page_present as long as the guest isn't
running on shadow page tables.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir@xen.org>
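Illustrative note on the change above: a sketch of filtering on page-fault error code bits before trying the writable-pagetable handler. The bit positions are the architectural x86 ones (present = bit 0, write = bit 1). Requiring the write bit and the function name are assumptions for illustration; the message itself only mentions PFEC_page_present.

```python
PFEC_page_present = 1 << 0
PFEC_write_access = 1 << 1


def passes_ptwr_filter(error_code, on_shadow_pagetables):
    """Return True if the fault is worth handing to the ptwr handler."""
    required = PFEC_write_access  # assumption: only write faults qualify
    if not on_shadow_pagetables:
        # Faults on non-present pages are filtered out early,
        # except when the guest runs on shadow page tables.
        required |= PFEC_page_present
    return (error_code & required) == required
```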
Keir Fraser [Mon, 29 Nov 2010 14:34:32 +0000 (14:34 +0000)]
x86-64: don't crash Xen upon direct pv guest access to GDT/LDT mapping area
handle_gdt_ldt_mapping_fault() is intended to deal with indirect
accesses (i.e. those caused by descriptor loads) to the GDT/LDT
mapping area only. While for 32-bit segment limits indeed prevent the
function being entered for direct accesses (i.e. a #GP fault will be
raised even before the address translation gets done), on 64-bit even
user mode accesses would lead to control reaching the BUG_ON() at the
beginning of that function.
Fortunately the fix is simple: Since the guest kernel runs in ring 3,
any guest direct access will have the "user mode" bit set, whereas
descriptor loads always do the translations to access the actual
descriptors as kernel mode ones.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Further, relax the BUG_ON() in handle_gdt_ldt_mapping_fault() to a
check-and-bail. This avoids any problems in future, if we don't
execute x86_64 guest kernels in ring 3 (e.g., because we use a
lightweight HVM container).
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Fri, 26 Nov 2010 14:25:30 +0000 (14:25 +0000)]
xenpaging: increase recently used pages from 4MB to 64MB
Increase recently used pages from 4MB to 64MB. Keeping more pages in
memory allows the guest to make more progress if the paging file spans
the entire guest memory.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:23:31 +0000 (14:23 +0000)]
xenpaging: handle temporary out-of-memory conditions during page-in
p2m_mem_paging_prep() should return -ENOMEM if a new page could not be
allocated. This can be handled in xenpaging to retry the
page-in. Right now such a condition would stall the guest because the
requested page will not come back; xenpaging simply exits. So
xenpaging could very well retry the allocation forever to rescue the
guest.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
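Illustrative note on the change above: a retry loop around a page-in preparation step that can fail temporarily with -ENOMEM. `prep` is a hypothetical callable standing in for the prepare call; `delay` and `max_tries` are illustration parameters, with the default retrying forever as the message suggests.

```python
import errno
import time


def page_in_with_retry(prep, gfn, delay=0.0, max_tries=None):
    """Call prep(gfn) until it returns something other than -ENOMEM.

    max_tries=None retries forever; any other error is returned at once.
    """
    tries = 0
    while True:
        rc = prep(gfn)
        if rc != -errno.ENOMEM:
            return rc
        tries += 1
        if max_tries is not None and tries >= max_tries:
            return rc
        time.sleep(delay)
```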
Keir Fraser [Fri, 26 Nov 2010 14:22:38 +0000 (14:22 +0000)]
xenpaging: print xenpaging cmdline options
Print xenpaging arguments to simplify domain_id mapping from xenpaging
logfile to other logfiles and Xen console output.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:22:11 +0000 (14:22 +0000)]
xenpaging: optimize p2m_mem_paging_populate
p2m_mem_paging_populate will always put another request in the ring.
To reduce pressure on the ring, place only required requests in the
ring. If the gfn was already processed by another thread, and the
current vcpu does not need to be paused, p2m_mem_paging_resume will do
nothing with the request. And also xenpaging will drop the request if
the vcpu does not need a wakeup.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:21:33 +0000 (14:21 +0000)]
xenpaging: when populating a page, check if populating is already in progress
p2m_mem_paging_populate can be called several times from different
vcpus. If the page is already in state p2m_ram_paging_in and has a new
valid mfn, invalidating this new mfn will cause trouble later if
p2m_mem_paging_resume will set the new gfn/mfn pair back to state
p2m_ram_rw. Detect this situation and keep p2m state if the page is
in the process of being still paged-out or already paged-in. In fact,
p2m state p2m_ram_paged is the only state where the mfn type can be
invalidated.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:20:39 +0000 (14:20 +0000)]
xenpaging: allow negative num_pages and limit num_pages
Simplify paging size argument. If a negative number is specified, it
means the entire guest memory should be paged out. This is useful for
debugging. Also limit num_pages to the guests max_pages.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
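Illustrative note on the change above: the argument handling described in the message, as a small function. The function name is hypothetical.

```python
def effective_num_pages(requested, max_pages):
    """A negative request means 'page out the entire guest'; otherwise the
    request is limited to the guest's max_pages."""
    if requested < 0:
        return max_pages
    return min(requested, max_pages)
```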
Keir Fraser [Fri, 26 Nov 2010 14:20:10 +0000 (14:20 +0000)]
xenpaging: notify policy only on resume
If a page is requested more than once, the policy is also notified
more than once about the page-in. However, a page-in happens only
once. Any further resume will only unpause the other vcpu. The
multiple notify will put the page into the mru list multiple times and
it will unlock other already resumed pages too early. In the worst
case, a page that was just resumed can be evicted right away, causing
a deadlock in the guest.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:19:38 +0000 (14:19 +0000)]
xenpaging: print p2mt for already paged-in pages
Add more debug output, print p2mt for pages which were requested more
than once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:19:09 +0000 (14:19 +0000)]
xenpaging: print info when free request slots drop below 2
Add debugging aid to free request slots in the ring buffer.
It should not happen that the ring gets full, print info anyway if it
happens.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:18:32 +0000 (14:18 +0000)]
xenpaging: add signal handling
Leave paging loop if xenpaging gets a signal.
Remove paging file on exit.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:17:56 +0000 (14:17 +0000)]
xenpaging: populate paged-out pages unconditionally in grant code
Populate a page unconditionally to avoid missing a page-in request.
If the page is already in the process of being paged-in, this vcpu
will be stopped and later resumed once the page content is usable
again.
This matches other p2m_mem_paging_populate usage in the source tree.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Fri, 26 Nov 2010 14:17:01 +0000 (14:17 +0000)]
xenpaging: allow only one xenpaging binary per guest
Make sure only one xenpaging binary is active per domain.
Print info when the host lacks the required features for xenpaging.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Already-Acked-by: Patrick Colp <pjcolp@cs.ubc.ca>
Already-Acked-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 26 Nov 2010 14:15:59 +0000 (14:15 +0000)]
xenpaging: Open paging file only if xenpaging_init() succeeds
Open paging file only if xenpaging_init() succeeds. It can fail if the
host does not support the required virtualization features such as EPT
or if xenpaging was already started for this domain_id.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Already-Acked-by: Patrick Colp <pjcolp@cs.ubc.ca>
Already-Acked-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 26 Nov 2010 14:15:21 +0000 (14:15 +0000)]
xenpaging: break endless loop during initial page-out with large pagefiles
To allow starting xenpaging right after 'xm start XYZ', I
specified a pagefile size equal to the guest memory size in the hope
of catching more errors where the paged-out state of a p2mt is not
checked.
While doing that, xenpaging got into an endless loop because some
pages can't be paged out right away. Now the policy reports an error if
the gfn number wraps.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Already-Acked-by: Patrick Colp <pjcolp@cs.ubc.ca>
Already-Acked-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 26 Nov 2010 10:10:40 +0000 (10:10 +0000)]
Allow assign_irq_vector to return an old vector while moving an irq
The guest calls assign_irq_vector() to assign one if it doesn't have
one, and to find out the vector if it does have one.
If the cpu mask passed intersects with the existing mask, the old
vector is simply returned.
However, if the irq happens to be in transit at the time, this returns
EBUSY. This is unnecessary if, as soon as the irq migration succeeds,
the logic would just return the old vector anyway.
This patch makes two changes:
* Switch the checks, so if the mask overlaps it always returns
* Return -EAGAIN instead of -EBUSY for moving irqs, to let the caller
know that the failure is temporary and may work if repeated.
This fixes a bug where on certain hardware, using the credit2
scheduler, a pvops kernel with multiple vcpus doesn't boot.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
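Illustrative note on the change above: the reordered checks, sketched with integer bitmasks for cpu masks. The parameter names and the `allocate` callback are hypothetical; only the ordering (overlap check first, then -EAGAIN for an irq in transit) is taken from the message.

```python
import errno


def assign_irq_vector(old_vector, old_mask, new_mask, move_in_progress, allocate):
    """Return the existing vector if the masks overlap, even during a move;
    otherwise report a temporary failure while a move is in progress."""
    if old_vector is not None and (old_mask & new_mask):
        return old_vector
    if move_in_progress:
        return -errno.EAGAIN  # temporary: the caller may simply retry
    return allocate(new_mask)
```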
Keir Fraser [Fri, 26 Nov 2010 10:07:57 +0000 (10:07 +0000)]
cpupool: Simplify locking, use refcounts for cpupool liveness.
Signed-off-by: Keir Fraser <keir@xen.org>
Tim Deegan [Wed, 24 Nov 2010 10:20:03 +0000 (10:20 +0000)]
x86/mm: remove incorrect BUG_ON.
This BUG_ON tests a property of an effectively random PFN in the guest,
and is explicitly _not_ seeing the MFN that's known to be owned.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Ian Jackson [Tue, 23 Nov 2010 19:36:14 +0000 (19:36 +0000)]
tools/xl: show shutdown reason code, improve xl list heading
Previously, xl list would not reveal the shutdown reason code unless
it was SHUTDOWN_crashed. This is unfortunate; it makes it hard for
scripts which use xl to tell what's going on.
In this patch:
* xl list shows the reason code as a single letter if it is
any of the defined values from sched.h:
- poweroff or domain not shut down
r reboot
s suspend
c crashed
w watchdog
This is not 100% backward-compatible with xm but I think it's a
justifiable improvement. It would be nice to make the same change
to xm.
* xl list -v shows the full numeric reason code in hex, or "-" if the
domain is not shut down.
* xl list -v has column headings for the UUID and numeric reason
code. The heading for the reason code overlaps with the UUID a bit.
These headings are intended for human readers; scripts can parse
the output by breaking on whitespace.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
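Illustrative note on the change above: the letter mapping from the list in the message, using the SHUTDOWN_* values from the public sched.h (poweroff 0, reboot 1, suspend 2, crash 3, watchdog 4). The "?" fallback for undefined codes and the bare hex format are assumptions; the message does not say how those cases are printed.

```python
SHUTDOWN_LETTERS = {
    0: "-",  # poweroff
    1: "r",  # reboot
    2: "s",  # suspend
    3: "c",  # crashed
    4: "w",  # watchdog
}


def reason_letter(is_shutdown, reason):
    """Single-letter column for plain 'xl list'."""
    if not is_shutdown:
        return "-"
    return SHUTDOWN_LETTERS.get(reason, "?")  # "?" is a placeholder assumption


def reason_verbose(is_shutdown, reason):
    """Numeric column for 'xl list -v': hex code, or '-' if not shut down."""
    return format(reason, "x") if is_shutdown else "-"
```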
Juergen Gross [Tue, 23 Nov 2010 19:35:09 +0000 (19:35 +0000)]
tools: Add xl python bindings for cpumap
Signed-off-by: juergen.gross@ts.fujitsu.com
Tim Deegan [Tue, 23 Nov 2010 19:33:30 +0000 (19:33 +0000)]
tools/xl: only use the special "--incoming" name on actual migration
Only use the special "--incoming" name on actual migration
and not on every subsequent reboot.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Tim Deegan [Tue, 23 Nov 2010 19:31:29 +0000 (19:31 +0000)]
tools/xl: VMs that are started paused shouldn't reboot paused too.
Otherwise a VM that's been migrated won't ever reboot cleanly again.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Tim Deegan [Tue, 23 Nov 2010 19:30:27 +0000 (19:30 +0000)]
tools/xl: make it explicit that we migrate from stdin.
No semantic changes, just makes things a bit less confusing.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 23 Nov 2010 19:29:13 +0000 (19:29 +0000)]
tools: fetch remote changesets when force refetching/resetting qemu
This makes "make tools/ioemu-dir-force-update" usable for picking up
an entirely new QEMU_TAG.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 23 Nov 2010 19:28:03 +0000 (19:28 +0000)]
ocaml: install built modules
Previously the install target was having no effect because it ended up
calling the default target in the subdir Makefile instead of the
install target.
Resolve this by tying the tools/ocaml Makefiles into the generic
handling done by tools/Rules.mk.
Other changes arising in one way or another from this:
- Add libs/xl/META.in
- Update .hgignore for META files
- Create leading directories
- Remove existing module before installation in install target
(works around what appears to be a quirk of "ocamlfind install")
- Use the globally defined $(DESTDIR)
- Move "ocamlfind printconf destdir" to a common variable,
repurposing existing unused OCAMLDESTDIR, incorporating $(DESTDIR) at
the same time.
- Drop a few unused variable definitions (mainly to avoid deciding if
$(DESTDIR) made sense for them or not).
- Pass -destdir to ocamlfind in uninstall target for symmetry with
install target.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Stefano Stabellini [Tue, 23 Nov 2010 19:25:00 +0000 (19:25 +0000)]
libxl, xl: Account for shadow memory for PV guests too
We need to account for the memory needed by shadow pagetables even for PV
guests, because in that case shadow pagetables are used during live
migration.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Stefano Stabellini [Tue, 23 Nov 2010 19:23:22 +0000 (19:23 +0000)]
tools/libxl: use qdisk if blktap2 is not available
Whenever blktap2 is not available use qdisk as block backend instead.
[ This feature will only work with the relevant changesets from
qemu-xen-unstable, recently applied. -iwj ]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Jackson [Tue, 23 Nov 2010 19:21:22 +0000 (19:21 +0000)]
QEMU_TAG update
Ian Jackson [Tue, 23 Nov 2010 19:12:55 +0000 (19:12 +0000)]
Reapply
61c0c52a8c6c "qemu-xen: build adjustments"
The changeset
qemu-xen: build adjustments to support out-of-tree builds
works after all. Sorry for the noise.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 23 Nov 2010 18:38:16 +0000 (18:38 +0000)]
Revert
61c0c52a8c6c "qemu-xen: build adjustments"
It appears that the changeset
qemu-xen: build adjustments to support out-of-tree builds
broke the build.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Tue, 23 Nov 2010 16:43:38 +0000 (16:43 +0000)]
qemu-xen: build adjustments to support out-of-tree builds
QEMU by itself can be built outside of its source directory. With the
qemu repository being separate from the hypervisor/tools one it seems
to make sense to make use of this feature, but doing so requires a
couple of adjustments to the Xen changes to it. Basically, if
CONFIG_QEMU is found to indicate an existing directory, this directory
will be used rather than cloning the git repo into the build tree.
[ This changeset is the xen-unstable part of the patch but also
includes the QEMU_TAG update to pull in the qemu part. -iwj ]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Mon, 22 Nov 2010 19:16:34 +0000 (19:16 +0000)]
x86 hvm: Fix VPMU issue on Nehalem cpus
Fix an issue on Nehalem cpus where performance counter overflows may
lead to endless interrupt loops on this cpu.
Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Keir Fraser [Mon, 22 Nov 2010 19:13:00 +0000 (19:13 +0000)]
x86: Check for MWAIT in CPUID before using it in ACPI idle code.
Signed-off-by: Keir Fraser <keir@xen.org>
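A minimal sketch of the kind of check described above, not the code from
this changeset. It relies only on the architectural fact that CPUID leaf 1
reports MONITOR/MWAIT support in ECX bit 3. The helper names, the enum, and
the fallback to HLT are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1 reports MONITOR/MWAIT support in ECX bit 3. */
#define CPUID1_ECX_MONITOR_MWAIT (1u << 3)

/* Hypothetical idle methods; the names are illustrative only. */
enum idle_method { IDLE_HALT, IDLE_MWAIT };

/* Hypothetical helper: takes the ECX value returned by CPUID leaf 1. */
static bool cpu_has_mwait(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_MONITOR_MWAIT) != 0;
}

/* Assumed policy: only pick MWAIT when CPUID advertises it, else use HLT. */
static enum idle_method pick_idle_method(uint32_t cpuid1_ecx)
{
    return cpu_has_mwait(cpuid1_ecx) ? IDLE_MWAIT : IDLE_HALT;
}
```

Passing the register value in as a parameter keeps the decision separate
from the CPUID instruction itself, so the logic can be exercised on any
host.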
Keir Fraser [Mon, 22 Nov 2010 08:29:03 +0000 (08:29 +0000)]
event_channel: Fix uninitialised variable build error.
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Ian Jackson [Fri, 19 Nov 2010 18:55:18 +0000 (18:55 +0000)]
QEMU_TAG update
Jan Beulich [Fri, 19 Nov 2010 18:39:33 +0000 (18:39 +0000)]
stubdom: allow building with read-only sources
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andre Przywara [Fri, 19 Nov 2010 18:15:14 +0000 (18:15 +0000)]
libxc: fix tracing (broken with hypercall buffers)
The attached patch makes Xen tracing work again after the introduction
of the hypercall buffers broke it. Only a single line was missing.
Thanks to Uwe Dannowski for reporting this.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Fri, 19 Nov 2010 13:49:54 +0000 (13:49 +0000)]
x86-64: Re-map all physdev_op struct names in compat shim.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Fri, 19 Nov 2010 13:45:08 +0000 (13:45 +0000)]
Introduce PHYSDEVOP_get_free_pirq
Introduce a new physdev_op called PHYSDEVOP_get_free_pirq to allow a
guest to get a free pirq number from Xen; the hypervisor would keep
that pirq free for the guest to use in a mapping.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 19 Nov 2010 13:43:24 +0000 (13:43 +0000)]
Interrupt remapping to PIRQs in HVM guests
This patch allows HVM guests to remap interrupts and MSIs into pirqs;
once the mapping is in place the guest will receive the interrupt (or
the MSI) as an event. The interrupt to be remapped can either be an
interrupt of an emulated device or an interrupt of a passthrough
device and we keep track of that.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 19 Nov 2010 13:26:42 +0000 (13:26 +0000)]
VPMU: Add the Intel CPU X7542 to the list of supported processors
Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Keir Fraser [Fri, 19 Nov 2010 13:25:54 +0000 (13:25 +0000)]
consolidate custom parameter parsing routines looking for boolean values
Have a single function for this, rather than doing the same in half a
dozen places.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
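The consolidation described above could look roughly like the sketch
below: one shared parser, with each custom parameter handler reduced to a
single call. This is an illustration, not the code from the changeset. The
function names, the set of accepted spellings, and the return convention
(1 for true, 0 for false, -1 for unrecognised input) are all assumptions.

```c
#include <string.h>

/*
 * Hypothetical shared boolean parser.
 * Returns 1 for a true value, 0 for a false value, -1 if unrecognised.
 * The accepted spellings are assumed for this sketch.
 */
static int parse_bool_sketch(const char *s)
{
    static const char *const truthy[] = { "yes", "on", "true", "enable", "1" };
    static const char *const falsy[]  = { "no", "off", "false", "disable", "0" };
    size_t i;

    if (s == NULL)
        return -1;

    for (i = 0; i < sizeof(truthy) / sizeof(truthy[0]); i++)
        if (strcmp(s, truthy[i]) == 0)
            return 1;

    for (i = 0; i < sizeof(falsy) / sizeof(falsy[0]); i++)
        if (strcmp(s, falsy[i]) == 0)
            return 0;

    return -1;
}

/* Hypothetical option and handler showing how a caller would use it. */
static int opt_example = 1;

static void parse_example_param(const char *s)
{
    int val = parse_bool_sketch(s);

    /* Leave the option untouched when the value is not recognised. */
    if (val >= 0)
        opt_example = val;
}
```

Returning -1 for unrecognised input lets each caller decide whether to keep
its default or try a handler-specific keyword next.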